information item
'Memory States' from Almost Nothing: Representing and Computing in a Non-associative Algebra
This note presents a non-associative algebraic framework for the representation and computation of information items in high-dimensional space. This framework is consistent with the principles of spatial computing and with the empirical findings in cognitive science about memory. Computations are performed through a process of multiplication-like binding and non-associative interference-like bundling. Models that rely on associative bundling typically lose order information, which necessitates the use of auxiliary order structures, such as position markers, to represent sequential information that is important for cognitive tasks. In contrast, the non-associative bundling proposed allows the construction of sparse representations of arbitrarily long sequences that maintain their temporal structure across arbitrary lengths. The non-associative nature of the proposed framework results in the representation of a single sequence by two distinct states. The L-state, generated through left-associative bundling, continuously updates and emphasises a recency effect, while the R-state, formed through right-associative bundling, encodes finite sequences or chunks, capturing a primacy effect. The construction of these states may be associated with activity in the prefrontal cortex in relation to short-term memory and hippocampal encoding in long-term memory, respectively. The accuracy of retrieval is contingent upon a decision-making process that is based on the mutual information between the memory states and the cue. The model is able to replicate the Serial Position Curve, which reflects the empirical recency and primacy effects observed in cognitive experiments.
Keywords: Memory states, high-dimensional computing (VSA), nonassociative bundling, spatial computing, mutual information, Serial Position Curve
To appear in Neural Computation, Vol 37, Issue 6, June 2025
1 Introduction
In essence, the perception of an object is initialised with the activation of a sensory pole. This sensory activation has a rapid decay and lasts for only a few milliseconds.
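The difference between left- and right-associative bundling described in the abstract can be illustrated with a small sketch. The abstract does not specify the paper's actual binding or bundling operators, so the code below assumes a stand-in: bundling is element-wise addition followed by renormalization to unit length, which is non-associative. The names `random_vector`, `bundle`, `l_state`, and `r_state` are hypothetical and chosen only for this illustration.

```python
import math
import random
from functools import reduce


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def random_vector(dim, rng):
    """Random bipolar (+1/-1) vector, normalized to unit length."""
    return normalize([rng.choice((-1.0, 1.0)) for _ in range(dim)])


def bundle(a, b):
    """Assumed bundling operator (not the paper's): add, then renormalize.

    Because of the renormalization, bundle(bundle(a, b), c) differs from
    bundle(a, bundle(b, c)), so the operation is non-associative.
    """
    return normalize([x + y for x, y in zip(a, b)])


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def l_state(items):
    """Left-associative fold: ((x1 + x2) + x3) + ... ; favours recent items."""
    return reduce(bundle, items)


def r_state(items):
    """Right-associative fold: x1 + (x2 + (x3 + ...)) ; favours early items."""
    return reduce(lambda acc, x: bundle(x, acc), reversed(items))


if __name__ == "__main__":
    rng = random.Random(0)
    sequence = [random_vector(2000, rng) for _ in range(5)]
    left, right = l_state(sequence), r_state(sequence)
    for position, item in enumerate(sequence, start=1):
        print(position, round(cosine(left, item), 3), round(cosine(right, item), 3))
```

Under this assumed operator, each additional bundling step shrinks the weight of everything already in the state, so the left fold is most similar to the last item (a recency-like profile) and the right fold is most similar to the first item (a primacy-like profile). This is only a qualitative analogy to the L-state and R-state; it does not reproduce the paper's sparse representations or its mutual-information retrieval rule.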
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews
Lu, Michael, Cho, Hyundong Justin, Shi, Weiyan, May, Jonathan, Spangher, Alexander
Large Language Models (LLMs) have demonstrated impressive capabilities in generating coherent text but often struggle with grounding language and strategic dialogue. To address this gap, we focus on journalistic interviews, a domain rich in grounding communication and abundant in data. We curate a dataset of 40,000 two-person informational interviews from NPR and CNN, and reveal that LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions. Realizing that a fundamental deficit exists in multi-turn planning and strategic thinking, we develop a realistic simulated environment, incorporating source personas and persuasive elements, in order to facilitate the development of agents with longer-horizon rewards. Our experiments show that while source LLMs mimic human behavior in information sharing, interviewer LLMs struggle with recognizing when questions are answered and engaging persuasively, leading to suboptimal information extraction across model size and capability. These findings underscore the need for enhancing LLMs' strategic dialogue capabilities.
Can Humans Oversee Agents to Prevent Privacy Leakage? A Study on Privacy Awareness, Preferences, and Trust in Language Model Agents
Zhang, Zhiping, Guo, Bingcan, Li, Tianshi
Language model (LM) agents that act on users' behalf for personal tasks can boost productivity, but are also susceptible to unintended privacy leakage risks. We present the first study on people's capacity to oversee the privacy implications of the LM agents. By conducting a task-based survey (N=300), we investigate how people react to and assess the response generated by LM agents for asynchronous interpersonal communication tasks, compared with a response they wrote. We found that people may favor the agent response with more privacy leakage over the response they drafted or consider both good, increasing harmful disclosure from 15.7% to 55.0%. We further uncovered distinct patterns of privacy behaviors, attitudes, and preferences, and the nuanced interactions between privacy considerations and other factors. Our findings shed light on designing agentic systems that enable privacy-preserving interactions and achieve bidirectional alignment on privacy preferences to help users calibrate trust.
Efficient Model-agnostic Alignment via Bayesian Persuasion
Bai, Fengshuo, Wang, Mingzhi, Zhang, Zhaowei, Chen, Boyuan, Xu, Yinda, Wen, Ying, Yang, Yaodong
With recent advancements in large language models (LLMs), alignment has emerged as an effective technique for keeping LLMs consistent with human intent. Current methods primarily involve direct training through Supervised Fine-tuning (SFT) or Reinforcement Learning from Human Feedback (RLHF), both of which require substantial computational resources and extensive ground truth data. This paper explores an efficient method for aligning black-box large models using smaller models, introducing a model-agnostic and lightweight Bayesian Persuasion Alignment framework. We formalize this problem as an optimization of the signaling strategy from the small model's perspective. In the persuasion process, the small model (Advisor) observes the information item (i.e., state) and persuades large models (Receiver) to elicit improved responses. The Receiver then generates a response based on the input, the signal from the Advisor, and its updated belief about the information item. Through training using our framework, we demonstrate that the Advisor can significantly enhance the performance of various Receivers across a range of tasks. We theoretically analyze our persuasion framework and provide an upper bound on the Advisor's regret, confirming its effectiveness in learning the optimal signaling strategy. Our empirical results demonstrate that GPT-2 can significantly improve the performance of various models, achieving an average enhancement of 16.1% in mathematical reasoning ability and 13.7% in code generation. We hope our work can provide an initial step toward rethinking the alignment framework from the Bayesian Persuasion perspective.
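The Receiver's "updated belief about the information item" can be illustrated with the standard Bayes update used in Bayesian persuasion generally. The sketch below is a toy example under stated assumptions, not the paper's training procedure: the function name `posterior`, the state and signal labels, and all probabilities are hypothetical.

```python
def posterior(prior, signaling, signal):
    """Bayes update of the Receiver's belief after observing a signal.

    prior:     dict mapping state -> P(state)
    signaling: dict mapping state -> dict mapping signal -> P(signal | state),
               i.e. the signaling strategy the Advisor has committed to
    signal:    the signal actually observed
    """
    joint = {s: prior[s] * signaling[s].get(signal, 0.0) for s in prior}
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("signal has zero probability under the prior")
    return {s: p / total for s, p in joint.items()}


# Toy numbers (assumptions for illustration only).
prior = {"state_a": 0.3, "state_b": 0.7}
signaling = {
    "state_a": {"hint_a": 1.0},
    "state_b": {"hint_a": 3.0 / 7.0, "hint_b": 4.0 / 7.0},
}

if __name__ == "__main__":
    print(posterior(prior, signaling, "hint_a"))
    print(posterior(prior, signaling, "hint_b"))
```

With these toy numbers, observing `hint_a` raises the Receiver's belief in `state_a` from 0.3 to 0.5, since 0.3 / (0.3 + 0.7 × 3/7) = 0.5, while `hint_b` reveals `state_b` with certainty. Choosing the signaling strategy so that the induced beliefs lead to better responses is the optimization problem the abstract attributes to the Advisor.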
Investigating Conversational Search Behavior For Domain Exploration
Schneider, Phillip, Afzal, Anum, Vladika, Juraj, Braun, Daniel, Matthes, Florian
Conversational search has evolved as a new information retrieval paradigm, marking a shift from traditional search systems towards interactive dialogues with intelligent search agents. This change especially affects exploratory information-seeking contexts, where conversational search systems can guide the discovery of unfamiliar domains. In these scenarios, users often find it difficult to express their information goals due to insufficient background knowledge. Conversational interfaces can provide assistance by eliciting information needs and narrowing down the search space. However, due to the complexity of information-seeking behavior, the design of conversational interfaces for retrieving information remains a great challenge. Although prior work has employed user studies to empirically ground the system design, most existing studies are limited to well-defined search tasks or known domains, thus being less exploratory in nature. Therefore, we conducted a laboratory study to investigate open-ended search behavior for navigation through unknown information landscapes. The study comprised 26 participants who were restricted in their search to a text chat interface. Based on the collected dialogue transcripts, we applied statistical analyses and process mining techniques to uncover general information-seeking patterns across five different domains. We not only identify core dialogue acts and their interrelations that enable users to discover domain knowledge, but also derive design suggestions for conversational search systems.
Fine-grained Early Frequency Attention for Deep Speaker Representation Learning
Hajavi, Amirhossein, Etemad, Ali
Deep learning techniques have considerably improved speech processing in recent years. Speaker representations extracted by deep learning models are being used in a wide range of tasks such as speaker recognition and speech emotion recognition. Attention mechanisms have started to play an important role in improving deep learning models in the field of speech processing. Nonetheless, despite the fact that important speaker-related information can be embedded in individual frequency-bins of the input spectral representations, current attention models are unable to attend to fine-grained information items in spectral representations. In this paper we propose Fine-grained Early Frequency Attention (FEFA) for speaker representation learning. Our model is a simple and lightweight model that can be integrated into various CNN pipelines and is capable of focusing on information items as small as frequency-bins. We evaluate the proposed model on three tasks: speaker recognition, speech emotion recognition, and spoken digit recognition. We use three widely used public datasets, namely VoxCeleb, IEMOCAP, and Free Spoken Digit Dataset, for our experiments. We attach FEFA to several prominent deep learning models and evaluate its impact on the final performance. We also compare our work with other related works in the area. Our experiments show that by adding FEFA to different CNN architectures, performance is consistently improved by substantial margins, and the models equipped with FEFA outperform all the other attentive models. We also test our model against different levels of added noise, showing improvements in robustness and less sensitivity compared to the backbone networks.
Temporarily Unavailable: Memory Inhibition in Cognitive and Computer Science
Tempel, Tobias, Niederée, Claudia, Jilek, Christian, Ceroni, Andrea, Maus, Heiko, Runge, Yannick, Frings, Christian
Inhibition can take place at the level of neurotransmitters in the synaptic cleft, neurons can inhibit each other's firing rate, it can be shown at a physiological level (for instance by measuring the EEG), and finally it can be investigated on a purely behavioral level. Behavioral inhibition typically means something like 'making a content/action less accessible or suppressing it altogether' in order to enhance processing of relevant information. In cognition, thus, the concept of inhibition implies cognitive mechanisms that actively lower currently irrelevant or interfering information. Psychological theories that posit the existence of inhibitory mechanisms in our mind have elicited much research across diverse fields of Cognitive Psychology like perception, attention, action control, and memory, but have also been transferred to other research fields like Developmental Psychology, as, for instance, understanding the aging brain or the developing brain is closely linked to understanding how the brain handles irrelevant or interfering information, that is, how or whether the brain can inhibit such information. The two areas in Cognitive Psychology in which inhibition is traditionally investigated to the largest extent are the research fields of attention and memory. In attention research, typically the interference due to distracting stimuli or actions is analyzed in experimental paradigms that try to tap a specific form of cognitive inhibition. For example, in the Negative Priming task (for a review, Frings, Schneider, & Fox, 2015) it is typically analyzed how an irrelevant distractor stimulus is inhibited. In the cuing task that elicits the inhibition of return effect (Posner, Choate, Rafal, & Vaughn, 1985) it is typically analyzed how an irrelevant location is inhibited. In task switching (Kiesel et al., 2010), lowering competition by a just previously performed task while currently executing a novel task is achieved by inhibiting that previous task.
Advanced Memory Buoyancy for Forgetful Information Systems
Jilek, Christian, Chwalek, Jessica, Schwarz, Sven, Schröder, Markus, Maus, Heiko, Dengel, Andreas
Knowledge workers face an ever increasing flood of information in their daily lives. To counter this and provide better support for information management and knowledge work in general, we have been investigating solutions inspired by human forgetting since 2013. These solutions are based on Semantic Desktop (SD) and Managed Forgetting (MF) technology. A key concept of the latter is the so-called Memory Buoyancy (MB), which is intended to represent an information item's current value for the user and allows forgetting mechanisms to be employed. The SD thus continuously performs information value assessment, updating MB and triggering respective MF measures. We extended an SD-based organizational memory system, which we have been using in daily work for over seven years now, with MF mechanisms, directly embedding them in daily activities and enabling us to test and optimize them in real-world scenarios. In this paper, we first present our initial version of MB and discuss success and failure stories we have been experiencing with it during three years of practical usage. We learned from cognitive psychology that our previous research on context can be beneficial for MF. Thus, we created an advanced MB version especially taking user context, and in particular context switches, into account. These enhancements as well as a first prototypical implementation are presented, too.
Learning the Nature of Information in Social Networks
Agrawal, Rakesh (Microsoft) | Potamias, Michalis (Groupon) | Terzi, Evimaria (Boston University)
We postulate that the nature of information items plays a vital role in the observed spread of these items in a social network. We capture this intuition by proposing a model that assigns to every information item two parameters: endogeneity and exogeneity. The endogeneity of the item quantifies its tendency to spread primarily through the connections between nodes; the exogeneity quantifies its tendency to be acquired by the nodes, independently of the underlying network. We also extend this item-based model to take into account the openness of each node to new information. We quantify openness by introducing the receptivity of a node. Given a social network and data related to the ordering of adoption of information items by nodes, we develop a maximum-likelihood framework for estimating endogeneity, exogeneity and receptivity parameters. We apply our methodology to synthetic and real data and demonstrate its efficacy as a data-analytic tool.